Convolutional Pitch Target Approximation Model for Speech Synthesis

Authors

  • Xingyu Na
  • Philip N. Garner
Abstract

In this paper, we investigate pitch contour modelling in speech synthesis based on segmental units and propose a convolutional pitch target approximation model. The model allows joint stochastic modelling of framewise pitch and the pitch contour of longer units, whose relationship is made explicit by a convolutional target approximation filter. The pitch contour is stylized by a linear representation called the pitch target. In the synthesis stage, the likelihoods of the framewise model and the pitch target model are jointly maximized using a Toeplitz matrix that represents the discrete convolutional filter.

Index Terms: pitch modelling, speech synthesis, pitch target approximation.
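The abstract's remark that a Toeplitz matrix can represent a discrete convolutional filter rests on a general identity: convolving a sequence with a finite impulse response is the same as multiplying it by a matrix whose diagonals are constant. The following is a minimal sketch of that identity only, not the paper's model. The impulse response `h`, the input `x`, and all function names are hypothetical values chosen for illustration.

```python
def convolution_matrix(h, n):
    """Build the (n + len(h) - 1) x n Toeplitz matrix T with T[i][j] = h[i - j].

    Multiplying T by a length-n vector x gives the full discrete convolution h * x.
    """
    m = len(h)
    rows = n + m - 1
    return [[h[i - j] if 0 <= i - j < m else 0.0 for j in range(n)]
            for i in range(rows)]


def matvec(matrix, x):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


def convolve(h, x):
    """Direct full discrete convolution, used here as the reference result."""
    out = [0.0] * (len(h) + len(x) - 1)
    for i, hi in enumerate(h):
        for j, xj in enumerate(x):
            out[i + j] += hi * xj
    return out


# Hypothetical values: NOT the filter or the pitch targets from the paper.
h = [0.5, 0.3, 0.2]        # illustrative impulse response
x = [1.0, 2.0, 3.0, 4.0]   # illustrative input sequence

T = convolution_matrix(h, len(x))
y_matrix = matvec(T, x)    # filtering expressed as a matrix product
y_direct = convolve(h, x)  # filtering computed directly
```

Writing the filter as a matrix makes the filtered sequence a linear function of the input, which is the kind of form that lets a filter enter a joint likelihood maximization as a linear constraint. How the paper actually builds and uses its matrix is not specified in the abstract.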


Similar resources

Personalizing a speech synthesizer by voice adaptation

A voice adaptation system enables users to quickly create new voices for a text-to-speech system, allowing for the personalization of the synthesis output. The system adapts to the pitch and spectrum of the target speaker, using a probabilistic, locally linear conversion function based on a Gaussian Mixture Model. Numerical and perceptual evaluations reveal insights into the correlation between...

Modeling Pitch Contour of Chinese Mandarin Sentences with the PENTA Model

In continuous speech, the pitch contour of the same syllable may vary much due to its contextual information. The Parallel Encoding and Target Approximation (PENTA) model is applied here to Mandarin speech synthesis with a method to predict pitch contours for Chinese syllables with different contexts by combining the Classification And Regression Tree (CART) with the PENTA model to improve its ...

Modeling Pitch Contour of Chinese Mandarin Sentence with PENTA Model

In continuous speech, it is believed that the pitch contour of the same syllable may vary a lot due to its different context information. To apply the Parallel Encoding and Target Approximation (PENTA) model to Mandarin speech synthesis and improve its prediction accuracy, this paper proposed a method to predict pitch contours for Chinese syllables with different contexts by combining the Class...

Modeling Speech Melody as Communicative Functions with PENTAtrainer2

This paper presents PENTAtrainer2, a semi-automatic software package written as a Praat plug-in integrated with Java programs, and its applications for the analysis and synthesis of speech melody as communicative functions. Its core concepts are based on the Parallel Encoding and Target Approximation (PENTA) framework, the quantitative Target Approximation (qTA) model, and the simulated annealing opt...

Pitch target analysis of Thai tones using quantitative target approximation model and unsupervised clustering

This paper presents the integration of the quantitative target approximation (qTA) model and an unsupervised clustering technique to study Thai tones. The qTA model simulates F0 production on the basis of the articulation process. Parameters extracted from the F0 of Thai speech by the analysis-and-synthesis method were further analyzed by K-means clustering. The number and form of pitch target wer...

Publication date: 2013